After going through this tutorial you should be able to:
SingleCellExperiment objectsSingleCellExperiment objectsscatercellassignThis document compiled: http://kieranrcampbell.github.io/r-workshop-march-2019
suppressPackageStartupMessages({
library(scater) # BioConductor
library(SingleCellExperiment) # BioConductor
library(DropletUtils) # BioConductor
library(tidyverse) # CRAN
library(here) # CRAN
library(DT) # CRAN
library(pheatmap) # CRAN
})
Warning: package 'tibble' was built under R version 3.5.2
Warning: package 'purrr' was built under R version 3.5.2
Warning: package 'pheatmap' was built under R version 3.5.2
knitr::opts_chunk$set(echo = TRUE,
cache = TRUE)
If you need to install any of these, you can do so by:
install.packages(c("tidyverse", "here", "DT", "pheatmap", "BiocManager"))
BiocManager::install(c("scater", "SingleCellExperiment", "DropletUtils"))
It is highly recommended you do so before the tutorial as this can take significant time!
To install cellassign, we need to install Google’s Tensorflow framework. If this doesn’t work in the tutorial - don’t worry, you won’t need it for 80% of what we cover.
install.packages("tensorflow", repos = "http://cran.us.r-project.org")
The downloaded binary packages are in
/var/folders/wh/6v_w5j853g11zfhlb7f2rcvw0000gn/T//RtmpievTcL/downloaded_packages
tensorflow::install_tensorflow()
Creating virtualenv for TensorFlow at /Users/kierancampbell/.virtualenvs/r-tensorflow
Installing TensorFlow ...
Installation complete.
install.packages("devtools", repos = "http://cran.us.r-project.org") # If not already installed
There is a binary version available but the source version is
later:
binary source needs_compilation
devtools 1.13.6 2.0.1 FALSE
installing the source package 'devtools'
devtools::install_github("Irrationone/cellassign")
Skipping install of 'cellassign' from a github remote, the SHA1 (3ce59232) has not changed since last install.
Use `force = TRUE` to force installation
We’re going to use some re-processed data from Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations (patient 1 specifically). If you git cloned this repository (http://github.com/kieranrcampbell/https://github.com/kieranrcampbell/r-workshop-march-2019) this can be found in the directory data/outs/filtered_gene_bc_matrices/GRCh38:
data_dir <- here("data", "outs", "filtered_gene_bc_matrices", "GRCh38")
print(data_dir)
[1] "/Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38"
print(dir(data_dir))
[1] "barcodes.tsv" "genes.tsv" "matrix.mtx"
10X data typically has a barcodes file (indicating cell barcodes), a genes file (indicating a mapping between genes and indices) and the actual expression data as raw counts in matrix.mtx.
We can read these in to a SingleCellExperiment object using the read10xCounts function
sce <- read10xCounts(data_dir)
sce
class: SingleCellExperiment
dim: 33694 1767
metadata(0):
assays(1): counts
rownames(33694): ENSG00000243485 ENSG00000237613 ...
ENSG00000277475 ENSG00000268674
rowData names(2): ID Symbol
colnames: NULL
colData names(2): Sample Barcode
reducedDimNames(0):
spikeNames(0):
SingleCellExperiment packageOverall idea: think of your count matrix where the columns are cells and the rows are genes.
So when you see things like rowData think geneData, and colData think cellData!
Getting data dimensions
nrow(sce) # Number of rows = genes
[1] 33694
ncol(sce) # Number of columns = cells
[1] 1767
print(sce)
class: SingleCellExperiment
dim: 33694 1767
metadata(0):
assays(1): counts
rownames(33694): ENSG00000243485 ENSG00000237613 ...
ENSG00000277475 ENSG00000268674
rowData names(2): ID Symbol
colnames: NULL
colData names(2): Sample Barcode
reducedDimNames(0):
spikeNames(0):
sce[, c(1,3,5)] # Subset to cells 1, 3, 5
class: SingleCellExperiment
dim: 33694 3
metadata(0):
assays(1): counts
rownames(33694): ENSG00000243485 ENSG00000237613 ...
ENSG00000277475 ENSG00000268674
rowData names(2): ID Symbol
colnames: NULL
colData names(2): Sample Barcode
reducedDimNames(0):
spikeNames(0):
sce[c(2,4,6), ] # Subset to genes 2, 4, 6
class: SingleCellExperiment
dim: 3 1767
metadata(0):
assays(1): counts
rownames(3): ENSG00000237613 ENSG00000238009 ENSG00000239906
rowData names(2): ID Symbol
colnames: NULL
colData names(2): Sample Barcode
reducedDimNames(0):
spikeNames(0):
rownames = gene names
head(rownames(sce))
[1] "ENSG00000243485" "ENSG00000237613" "ENSG00000186092" "ENSG00000238009"
[5] "ENSG00000239945" "ENSG00000239906"
ENSG00000243485
ENSG00000237613
ENSG00000186092
ENSG00000238009
ENSG00000239945
ENSG00000239906
colnames = cell names = barcodes (sometimes)
head(colnames(sce))
NULL
Column data (cell specific metadata)
head(colData(sce))
DataFrame with 6 rows and 2 columns
Sample
<character>
1 /Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
2 /Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
3 /Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
4 /Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
5 /Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
6 /Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
Barcode
<character>
1 AAACCTGCAGTAAGCG-1
2 AAACCTGGTCCGTTAA-1
3 AAACCTGGTTCTGTTT-1
4 AAACCTGTCCTCATTA-1
5 AAACGGGAGTAGGCCA-1
6 AAACGGGAGTGTACTC-1
<character>
/Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
/Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
/Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
/Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
/Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
/Users/kierancampbell/ubc/r-workshop-march-2019/data/outs/filtered_gene_bc_matrices/GRCh38
<character>
AAACCTGCAGTAAGCG-1
AAACCTGGTCCGTTAA-1
AAACCTGGTTCTGTTT-1
AAACCTGTCCTCATTA-1
AAACGGGAGTAGGCCA-1
AAACGGGAGTGTACTC-1
I might want to set the column names to the barcode:
colnames(sce) <- colData(sce)$Barcode
head(colnames(sce))
[1] "AAACCTGCAGTAAGCG-1" "AAACCTGGTCCGTTAA-1" "AAACCTGGTTCTGTTT-1"
[4] "AAACCTGTCCTCATTA-1" "AAACGGGAGTAGGCCA-1" "AAACGGGAGTGTACTC-1"
AAACCTGCAGTAAGCG-1
AAACCTGGTCCGTTAA-1
AAACCTGGTTCTGTTT-1
AAACCTGTCCTCATTA-1
AAACGGGAGTAGGCCA-1
AAACGGGAGTGTACTC-1
And I can subset on barcode:
sce[, "AAACCTGCAGTAAGCG-1"]
class: SingleCellExperiment
dim: 33694 1
metadata(0):
assays(1): counts
rownames(33694): ENSG00000243485 ENSG00000237613 ...
ENSG00000277475 ENSG00000268674
rowData names(2): ID Symbol
colnames(1): AAACCTGCAGTAAGCG-1
colData names(2): Sample Barcode
reducedDimNames(0):
spikeNames(0):
Row data (gene specific metadata)
head(rowData(sce))
DataFrame with 6 rows and 2 columns
ID Symbol
<character> <character>
ENSG00000243485 ENSG00000243485 RP11-34P13.3
ENSG00000237613 ENSG00000237613 FAM138A
ENSG00000186092 ENSG00000186092 OR4F5
ENSG00000238009 ENSG00000238009 RP11-34P13.7
ENSG00000239945 ENSG00000239945 RP11-34P13.8
ENSG00000239906 ENSG00000239906 RP11-34P13.14
<character>
ENSG00000243485
ENSG00000237613
ENSG00000186092
ENSG00000238009
ENSG00000239945
ENSG00000239906
<character>
RP11-34P13.3
FAM138A
OR4F5
RP11-34P13.7
RP11-34P13.8
RP11-34P13.14
reducedDims is where our PCA,UMAP,tSNE representations will live - but we haven’t made them yet
reducedDims(sce)
List of length 0
names(0):
sizeFactors is where our cell size factors will live - but we haven’t calculated them yet
head(sizeFactors(sce))
NULL
The ability to have multiple assays is one of the unique advantages of the SingleCellExperiment approach - I can carry around counts, logcounts, and any other weird data transformation I might like. Right now we only have raw counts, because that’s what we’ve read in:
names(assays(sce))
[1] "counts"
counts
assay(sce, "counts")
counts(sce)
I can make my own:
assay(sce, "my_super_strange_assay") <- sin(as.matrix(counts(sce)))
names(assays(sce))
[1] "counts" "my_super_strange_assay"
counts
my_super_strange_assay
class(assay(sce, "my_super_strange_assay"))
[1] "matrix"
matrix
Note the beauty of SingleCellExperiments is that subsetting is consistent: if I want to subset only some cells and genes:
sce_subset <- sce[c(1,3,5), c(2,4,6,8)]
Then everything else is subset too!
print(dim(counts(sce_subset)))
[1] 3 4
print(length(sizeFactors(sce_subset)))
[1] 0
print(dim(rowData(sce_subset)))
[1] 3 2
So the approach may seem like a lot of work up front to just carry around a count matrix and some metadata, but this sort of consistent subsetting makes it much harder (but still not impossible) to introduce bugs into your analysis.
First we do some key things to our data:
rowData(sce)$ensembl_gene_id <- rownames(sce)
sce <- getBMFeatureAnnos(sce,
filters = "ensembl_gene_id",
attributes = c("ensembl_gene_id", "hgnc_symbol", "entrezgene",
"start_position", "end_position", "chromosome_name"),
dataset = "hsapiens_gene_ensembl")
Batch submitting query [>-------------------------] 3% eta: 1m Batch
submitting query [>-------------------------] 4% eta: 45s Batch
submitting query [=>------------------------] 6% eta: 40s Batch
submitting query [=>------------------------] 7% eta: 47s Batch
submitting query [=>------------------------] 9% eta: 43s Batch
submitting query [==>-----------------------] 10% eta: 43s Batch
submitting query [==>-----------------------] 12% eta: 40s Batch
submitting query [==>-----------------------] 13% eta: 39s Batch
submitting query [===>----------------------] 15% eta: 38s Batch
submitting query [===>----------------------] 16% eta: 36s Batch
submitting query [====>---------------------] 18% eta: 34s Batch
submitting query [====>---------------------] 19% eta: 33s Batch
submitting query [====>---------------------] 21% eta: 32s Batch
submitting query [=====>--------------------] 22% eta: 31s Batch
submitting query [=====>--------------------] 24% eta: 30s Batch
submitting query [=====>--------------------] 25% eta: 29s Batch
submitting query [======>-------------------] 26% eta: 28s Batch
submitting query [======>-------------------] 28% eta: 27s Batch
submitting query [=======>------------------] 29% eta: 27s Batch
submitting query [=======>------------------] 31% eta: 26s Batch
submitting query [=======>------------------] 32% eta: 25s Batch
submitting query [========>-----------------] 34% eta: 24s Batch
submitting query [========>-----------------] 35% eta: 24s Batch
submitting query [=========>----------------] 37% eta: 25s Batch
submitting query [=========>----------------] 38% eta: 24s Batch
submitting query [=========>----------------] 40% eta: 23s Batch
submitting query [==========>---------------] 41% eta: 23s Batch
submitting query [==========>---------------] 43% eta: 22s Batch
submitting query [==========>---------------] 44% eta: 23s Batch
submitting query [===========>--------------] 46% eta: 24s Batch
submitting query [===========>--------------] 47% eta: 24s Batch
submitting query [============>-------------] 49% eta: 24s Batch
submitting query [============>-------------] 50% eta: 25s Batch
submitting query [============>-------------] 51% eta: 25s Batch
submitting query [=============>------------] 53% eta: 24s Batch
submitting query [=============>------------] 54% eta: 23s Batch
submitting query [==============>-----------] 56% eta: 22s Batch
submitting query [==============>-----------] 57% eta: 21s Batch
submitting query [==============>-----------] 59% eta: 20s Batch
submitting query [===============>----------] 60% eta: 19s Batch
submitting query [===============>----------] 62% eta: 18s Batch
submitting query [===============>----------] 63% eta: 18s Batch
submitting query [================>---------] 65% eta: 17s Batch
submitting query [================>---------] 66% eta: 16s Batch
submitting query [=================>--------] 68% eta: 16s Batch
submitting query [=================>--------] 69% eta: 15s Batch
submitting query [=================>--------] 71% eta: 14s Batch
submitting query [==================>-------] 72% eta: 13s Batch
submitting query [==================>-------] 74% eta: 12s Batch
submitting query [===================>------] 75% eta: 12s Batch
submitting query [===================>------] 76% eta: 11s Batch
submitting query [===================>------] 78% eta: 10s Batch
submitting query [====================>-----] 79% eta: 10s Batch
submitting query [====================>-----] 81% eta: 9s Batch
submitting query [====================>-----] 82% eta: 8s Batch
submitting query [=====================>----] 84% eta: 8s Batch
submitting query [=====================>----] 85% eta: 7s Batch
submitting query [======================>---] 87% eta: 6s Batch
submitting query [======================>---] 88% eta: 5s Batch
submitting query [======================>---] 90% eta: 5s Batch
submitting query [=======================>--] 91% eta: 4s Batch
submitting query [=======================>--] 93% eta: 3s Batch
submitting query [=======================>--] 94% eta: 3s Batch
submitting query [========================>-] 96% eta: 2s Batch
submitting query [========================>-] 97% eta: 1s Batch submitting
query [=========================>] 99% eta: 1s Batch submitting query
[==========================] 100% eta: 0s
# Calculate size factors
sce <- scran::computeSumFactors(sce, BPPARAM = MulticoreParam(10))
Warning in FUN(...): encountered negative size factor estimates
# Compute log normal expression values
sce <- normalize(sce)
names(assays(sce))
[1] "counts" "my_super_strange_assay"
[3] "logcounts"
counts
my_super_strange_assay
logcounts
head(sizeFactors(sce))
[1] 0.033215558 0.071659554 0.009291482 3.408452925 2.399221881 0.106118551
sce <- runPCA(sce)
sce <- runUMAP(sce)
reducedDims(sce)
List of length 2
names(2): PCA UMAP
Just give me my PCA!
head(reducedDims(sce)[['PCA']])
PC1 PC2
AAACCTGCAGTAAGCG-1 -6.822310 -1.921039
AAACCTGGTCCGTTAA-1 -3.898654 -2.770704
AAACCTGGTTCTGTTT-1 -7.841220 -1.896710
AAACCTGTCCTCATTA-1 7.392807 2.167728
AAACGGGAGTAGGCCA-1 12.452015 -4.042698
AAACGGGAGTGTACTC-1 -4.069171 -1.635552
I like to add the symbols to the rownames:
rownames(sce) <- paste0(rowData(sce)$Symbol, "_", rownames(sce))
head(rownames(sce))
[1] "RP11-34P13.3_ENSG00000243485" "FAM138A_ENSG00000237613"
[3] "OR4F5_ENSG00000186092" "RP11-34P13.7_ENSG00000238009"
[5] "RP11-34P13.8_ENSG00000239945" "RP11-34P13.14_ENSG00000239906"
RP11-34P13.3_ENSG00000243485
FAM138A_ENSG00000237613
OR4F5_ENSG00000186092
RP11-34P13.7_ENSG00000238009
RP11-34P13.8_ENSG00000239945
RP11-34P13.14_ENSG00000239906
We next need to work out which genes are mitochondrial and ribosomal as these work well for QC:
# Get Mitochondrial genes for QC:
mt_genes <- which(rowData(sce)$chromosome_name == "MT")
ribo_genes <- grepl("^RP[LS]", rowData(sce)$Symbol)
feature_ctrls <- list(mito = rownames(sce)[mt_genes],
ribo = rownames(sce)[ribo_genes])
lapply(feature_ctrls, head)
$mito
[1] "MT-ND1_ENSG00000198888" "MT-ND2_ENSG00000198763"
[3] "MT-CO1_ENSG00000198804" "MT-CO2_ENSG00000198712"
[5] "MT-ATP8_ENSG00000228253" "MT-ATP6_ENSG00000198899"
$ribo
[1] "RPL22_ENSG00000116251" "RPL11_ENSG00000142676"
[3] "RPS6KA1_ENSG00000117676" "RPS8_ENSG00000142937"
[5] "RPL5_ENSG00000122406" "RPS27_ENSG00000177954"
MT-ND1_ENSG00000198888
MT-ND2_ENSG00000198763
MT-CO1_ENSG00000198804
MT-CO2_ENSG00000198712
MT-ATP8_ENSG00000228253
MT-ATP6_ENSG00000198899
RPL22_ENSG00000116251
RPL11_ENSG00000142676
RPS6KA1_ENSG00000117676
RPS8_ENSG00000142937
RPL5_ENSG00000122406
RPS27_ENSG00000177954
And call the calcualteQCMetrics function in scater:
sce <- calculateQCMetrics(sce, feature_controls = feature_ctrls)
datatable(head(as.data.frame(colData(sce))))
My personal favourite plot to QC scRNA-seq data:
plotColData(sce, x = "total_features_by_counts", y = "pct_counts_mito")
Typically retain cells that have < 10% mitochondrial transcripts and > 1000 features, but this is dataset dependent - for example, tumour cells typically have higher metabolic burden, leading to higher % mitochondrial (we typically use 20% as a filter then).
plotPCA(sce, colour_by = "pct_counts_mito")
Warning: 'add_ticks' is deprecated.
Use '+ geom_rug(...)' instead.
For this going to use a simple threshold of 10% mitochondrial. Importantly, we re-compute the QC metrics, size factors and normalization
mito_thresh <- 10
sce_qc <- sce[, sce$pct_counts_mito < mito_thresh]
sce_qc <- scran::computeSumFactors(sce_qc, BPPARAM = MulticoreParam(10))
sce_qc <- normalize(sce_qc)
sce_qc <- calculateQCMetrics(sce_qc, feature_controls = feature_ctrls)
sce_qc <- runPCA(sce_qc)
sce_qc <- runUMAP(sce_qc)
# plotPCA(sce_qc, colour_by = "total_features_by_counts")
plotUMAP(sce_qc, colour_by = "pct_counts_mito")
Warning: 'add_ticks' is deprecated.
Use '+ geom_rug(...)' instead.
plotScater(sce)
plotScater(sce_qc)
plotHighestExprs(sce_qc)
I can call
plotPCA(sce, colour_by = "x")plotUMAP(sce, colour_by = "x")plotTSNE(sce, colour_by = "x")where x is:
colData(sce) (= the cell specific data) to colour by metadatarownames(sce) to colour by expressionplotPCA(sce, colour_by = "SAA1_ENSG00000173432")
Warning: 'add_ticks' is deprecated.
Use '+ geom_rug(...)' instead.
plotColData(sce_qc, x = "total_counts", y = "pct_counts_mito")
plotExpression(sce_qc,
x = "total_counts",
features = "GAPDH_ENSG00000111640")
CellAssign is our new method to assign cells to known cell types. It relies on assuming cell types over-express their own markers, e.g. an epithelial tumour cell should overexpress EPCAM relative to other cell types.
library(cellassign)
In this example, the data we have just performed QC and exploratory analysis of is liver cells, that we expect to contain a certain number of cell types. To begin, we specify a list, where each item corresponds to a set of marker genes for a given cell type:
liver_marker_list <- list(
Hepatocytes = c("ALB", "HAMP", "ARG1", "PCK1", "AFP", "BCHE"),
LSECs = c("CALCRL", "FCGR2B", "VWF"),
Cholangiocytes = c("KRT19", "EPCAM", "CLDN4", "CLDN10", "SOX9", "MMP7", "CXCL1", "CFTR", "TFF2", "KRT7", "CD24"),
`Hepatic Stellate Cells` = c("ACTA2", "COL1A1", "COL1A2", "COL3A1", "DCN", "MYL9"),
Macrophages = c("CD68", "MARCO", "FCGR3A", "LYZ", "PTPRC"),
`ab T cells` = c("CD2", "CD3D", "TRAC", "IL32", "CD3E", "PTPRC"),
`gd T cells` = c("NKG7", "FCGR3A", "HOPX", "GNLY", "KLRF1", "CMC1", "CCL3", "PTPRC"),
`NK cells` = c("GZMK", "KLRF1", "CCL3", "CMC1", "NKG7", "PTPRC"),
`Plasma cells` = c("CD27", "IGHG1", "CD79A", "IGHG2", "PTPRC", "IGKC"),
`Mature B cells` = c("MS4A1", "LTB", "CD52", "IGHD", "CD79A", "PTPRC", "IGKC"),
`Erythroid cells` = c("HBB", "SLC25A37", "CA1", "ALAS2")
)
To begin, we use cellassign’s marker_list_to_mat function to convert this into a (binary) cell type by marker matrix:
mgi <- marker_list_to_mat(liver_marker_list, include_other = FALSE)
pheatmap(mgi)
We then make sure all of these markers exist in our SingleCellExperiment:
marker_in_sce <- match(rownames(mgi), rowData(sce_qc)$Symbol)
stopifnot(all(!is.na(marker_in_sce)))
And finally we subset sce to just the marker genes:
sce_marker <- sce_qc[marker_in_sce, ]
stopifnot(all.equal(rownames(mgi), rowData(sce_marker)$Symbol))
We then call cellassign passing in the SingleCellExperiment, marker info, the size factors we’ve calculated, as well as various other options:
save.image("~/Desktop/image.rds")
counts(sce_marker) <- as.matrix(counts(sce_marker))
print(dim(sce_marker))
[1] 56 331
print(dim(mgi))
[1] 56 11
fit <- cellassign(
exprs_obj = sce_marker,
marker_gene_info = mgi,
s = sizeFactors(sce_qc),
shrinkage = TRUE,
max_iter_adam = 50,
min_delta = 2,
verbose = TRUE
)
50 L old: -45341.4248612881; L new: -13086.9923911386; Difference (%): 0.71136786214427
50 L old: -13086.9923911386; L new: -12637.7934535153; Difference (%): 0.0343240772362165
50 L old: -12637.7934535153; L new: -12459.5540075771; Difference (%): 0.0141036840484733
50 L old: -12459.5540075771; L new: -12394.0136869888; Difference (%): 0.00526024611703001
50 L old: -12394.0136869888; L new: -12369.1578771128; Difference (%): 0.00200546897104786
50 L old: -12369.1578771128; L new: -12359.4793407949; Difference (%): 0.000782473343297033
50 L old: -12359.4793407949; L new: -12352.0271620321; Difference (%): 0.000602952483459608
50 L old: -12352.0271620321; L new: -12347.4391063071; Difference (%): 0.000371441518452011
50 L old: -12347.4391063071; L new: -12344.007432408; Difference (%): 0.000277925962584415
20 L old: -12344.007432408; L new: -12342.7594546598; Difference (%): 0.000101099886328378
20 L old: -12342.7594546598; L new: -12341.5678489241; Difference (%): 9.65428954589848e-05
fit
A cellassign fit for 331 cells, 56 genes, 11 cell types with 0 covariates
To access MLE cell types, call x$cell_type
To access MLE parameter estimates, call x$mle_params
Add the cell types to the sce:
sce_qc$cell_type <- fit$cell_type
plotUMAP(sce_qc, colour_by = "cell_type")
Warning: 'add_ticks' is deprecated.
Use '+ geom_rug(...)' instead.
acol <- data.frame(`cellassign cell type` = sce_qc$cell_type)
rownames(acol) <- colnames(sce_qc)
pheatmap(as.matrix(logcounts(sce_marker)),
annotation_col = acol)
pheatmap(fit$mle_params$gamma)